
Linear Transformations

A linear transformation is a function that maps vectors to vectors in a way that preserves the vector space structure. In simpler terms, a linear transformation preserves vector addition and scalar multiplication. One way to conceptualize this is to think of a linear transformation as a function that stretches, compresses, rotates, or reflects vectors in a vector space. The transformation has to preserve the structure of the space, which means that it cannot "tear" or "break" vectors in the process.

A result of this is that after applying a linear transformation, the gridlines in the vector space will remain parallel and evenly spaced. This property is crucial in many applications of linear algebra, such as computer graphics, physics, and engineering.

A more formal way to define a linear transformation is to say that a function is a linear transformation if it satisfies the following two properties:

  1. Additivity: $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})$ for all vectors $\mathbf{u}, \mathbf{v}$.
  2. Homogeneity: $T(c\mathbf{v}) = cT(\mathbf{v})$ for all vectors $\mathbf{v}$ and all scalars $c$.


Conceptual Introduction

To understand linear transformations better, let's consider a simple example.

Consider a 2D vector space, $\mathbb{R}^2$, with the standard basis vectors $\hat{i}$ and $\hat{j}$.

To describe a linear transformation, we can't possibly describe how it affects every single vector in the space. Instead, due to the linearity of the transformation, it suffices to describe how it affects the basis vectors.

So, if you know how it affects and , you can figure out how it affects any vector in the space.

Consider a transformation that maps to and to . Let's first visualize this transformation:

Recall that a vector in can be written as a linear combination of the basis vectors and . So, any vector can be written as .

To find how the transformation affects , we can find how it affects and and then combine the results. For example, if , then:

Notice how we were able to find the transformation of by only knowing how the transformation affects the basis vectors. This relies on the linearity of the transformation, which will be proven later.
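
To make this concrete, here is a small NumPy sketch. The images of the basis vectors below are made-up placeholders (not the ones from this example), but the idea is the same: once you know where $\hat{i}$ and $\hat{j}$ land, $T(\mathbf{v})$ is just the corresponding linear combination.

```python
import numpy as np

# Hypothetical images of the basis vectors under some transformation T.
# These values are made up for illustration.
T_i = np.array([1.0, 2.0])   # T(i-hat)
T_j = np.array([-1.0, 1.0])  # T(j-hat)

v = np.array([3.0, 2.0])     # v = 3*i-hat + 2*j-hat

# Linearity: T(v) = 3 * T(i-hat) + 2 * T(j-hat)
T_v = v[0] * T_i + v[1] * T_j
print(T_v)  # [1. 8.]
```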

Terminology and Notation

Before we dive deeper into linear transformations, let's define some things.

The notation we use for representing functions can be a bit confusing.

Consider a function .

  • The domain of the function is the set of allowed inputs to the function - in this case, .
  • The codomain of the function is the set that the function outputs to - in this case, .
  • The image or range of a set under the function is the set of all outputs of the function when you input elements of .

Functions have unique outputs for each input, so the image of a set under a function is unique.

Next, we need to clarify the difference between the image and the codomain of a function.

  • The image or range of a function is the set of all possible outputs of the function.
  • The codomain of a function is the set that the function outputs to. It is where the range "lives".

To illustrate this difference, consider a function .

Its codomain is , as the function outputs real numbers. However, the image of the function is , as the function only outputs non-negative real numbers.

In this way, the image is a subset of the codomain.

Matrix Representation

Since linear transformations are so important in linear algebra, it's useful to have a way to represent them in a more compact form. This is where matrices come in.

Let be a linear transformation that maps to and to . Then, the transformation can be represented by the matrix:

A matrix is, fundamentally, simply a two-dimensional array of numbers.

The columns of the matrix represent the basis vectors after the transformation. So, the first column represents and the second column represents .

Recall that you can find the transformation of any vector by finding how the transformation affects the basis vectors and then combining the results. So, for any vector , the transformation of can be found by:

Another way to notate it is to have the matrix written before the vector:

We can formulate a general equation for such transformation as:

Generalizing to Higher Dimensions

The concept of linear transformations, like basically everything in linear algebra, can be extended to higher dimensions.

For example, in $\mathbb{R}^3$, a linear transformation can be represented by a $3 \times 3$ matrix.

Let be a linear transformation between two vector spaces of dimensions and . Then can be represented by an matrix.

Each column will represent the image of the corresponding basis vector in the domain space. This idea can be written as:

Where the are the elements of the matrix representing the transformation, and are the components of the vector . The basis vectors can be represented as:

Then, the transformation of any vector can be found by:

Common Linear Transformations

There are several common linear transformations that are frequently encountered in linear algebra.

Scaling

A scaling transformation stretches or compresses vectors along a particular direction.

This is quite simple - you just multiply all the basis vectors by a scalar.

For example, consider the transformation defined by:

goes from to , and goes from to . Hence, the transformation scales the space by a factor of 2 in the -direction and 3 in the -direction.
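
A minimal NumPy sketch of this scaling transformation, assuming the standard basis:

```python
import numpy as np

# Scaling matrix from the example: factor 2 in the x-direction, 3 in the y-direction.
# Its columns are the scaled basis vectors (2, 0) and (0, 3).
S = np.array([[2.0, 0.0],
              [0.0, 3.0]])

v = np.array([1.0, 1.0])  # an arbitrary test vector
print(S @ v)              # [2. 3.]
```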

Rotation

A rotation transformation rotates vectors in the vector space.

This is a bit more complex than scaling, but it can still be represented by a matrix.

Essentially, you rotate the basis vectors by a certain angle. To do so, we need to use some trigonometry.

Consider the transformation that rotates vectors by an angle counterclockwise. It transforms to . To find , we can observe it geometrically:

(More coming soon)
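
In the meantime, here is a sketch of the standard result: the columns of the rotation matrix are the rotated basis vectors, $(\cos\theta, \sin\theta)$ and $(-\sin\theta, \cos\theta)$.

```python
import numpy as np

def rotation_matrix(theta: float) -> np.ndarray:
    """Counterclockwise rotation by theta radians.

    The first column is the image of i-hat, the second is the image of j-hat.
    """
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

R = rotation_matrix(np.pi / 2)              # rotate by 90 degrees
print(np.round(R @ np.array([1.0, 0.0])))   # [0. 1.]: i-hat lands on j-hat
```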

Proving the Properties of Linear Transformations

To prove that a function is a linear transformation, you need to show that it satisfies the two properties mentioned earlier: additivity and homogeneity.

Consider this matrix:

Each vector is a column vector in the matrix and has components, so is an matrix.

Let be a function defined by .

We can express as and, hence, as:

To show that is a linear transformation, we need to show that it satisfies the two properties.

  • Additivity: $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})$ for all vectors $\mathbf{u}, \mathbf{v}$.
  • Homogeneity: $T(c\mathbf{v}) = cT(\mathbf{v})$ for all vectors $\mathbf{v}$ and all scalars $c$.

Proving the Linear Transformation's Additivity

To show that $T$ is additive, we need to show that $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})$.

Let and . Let be some matrix that defines the transformation . Then, is:

Hence, is additive.

Proving the Linear Transformation's Homogeneity

To show that $T$ is homogeneous, we need to show that $T(c\mathbf{v}) = cT(\mathbf{v})$.

Let and be some scalar. Let be some matrix that defines the transformation . Then, is:

Thus, is both additive and homogeneous, and hence a linear transformation.

This shows that any transformation defined via matrix multiplication is a linear transformation.
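
As a quick sanity check (not a replacement for the proof above), we can verify both properties numerically for an arbitrary matrix and arbitrary vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))   # an arbitrary 3x2 matrix
u = rng.standard_normal(2)
v = rng.standard_normal(2)
c = 2.5

# Additivity: A(u + v) == Au + Av
assert np.allclose(A @ (u + v), A @ u + A @ v)
# Homogeneity: A(cv) == c(Av)
assert np.allclose(A @ (c * v), c * (A @ v))
print("both properties hold")
```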

The Identity Matrix and Standard Basis Vectors

Consider a matrix like this:

This is known as the identity matrix, and to show why it's called that, consider a vector .

Then, the product is:

So, the identity matrix doesn't change the vector it multiplies.

We can also see this without applying the matrix multiplication. Recall that a transformation is defined by how it affects the basis vectors. In the first column, the identity matrix has a $1$ in the first row and a $0$ in the rest. This means that the first basis vector is unchanged. The same goes for the second, third, and so on.

This is why the identity matrix is called the identity matrix - it doesn't change the vector it multiplies.
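
A quick NumPy illustration using the $3 \times 3$ identity:

```python
import numpy as np

I3 = np.eye(3)                   # the 3x3 identity matrix
v = np.array([4.0, -1.0, 2.0])

print(I3 @ v)    # [ 4. -1.  2.]: the vector is unchanged
print(I3[:, 0])  # [1. 0. 0.]: the first column is the first standard basis vector
```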

The subscript in denotes that the matrix is of size . For instance:

Consider each column of again. We can assign names to the columns of as :

These vectors are the standard basis vectors in . Recall the criteria for a set of basis vectors: they must be linearly independent and span the entire space.

For the first criterion, it's clear that the standard basis vectors are linearly independent. Only the first vector has a non-zero entry in the first row, so it cannot be written as a linear combination of the other vectors. The same applies to the second vector and the second row, and so on.

For the second criterion, consider any vector . Then, can be written as:

This shows that the standard basis vectors span the entire space.

Linear Transformations as Matrix Vector Products

Previously, we derived the matrix representation of a linear transformation. However, we did not prove that any linear transformation can be represented as a matrix-vector product.

Recall that any vector can be written as a linear combination of the basis vectors:

Now, consider applying a linear transformation to :

Using the linearity and homogeneity of , we can write this as:

We can represent this as a matrix-vector product.

Recall that a transformation on a vector can be evaluated by multiplying each basis vector by the corresponding component of the vector and summing the results.

This shows that any linear transformation can be represented as a matrix-vector product.

Approaches to Reasoning

We have just proven that any linear transformation can be represented as a matrix-vector product.

However, it's also important to notice that we have just shown this in two different ways:

  1. In the conceptual introduction, we reasoned through it using intuition. It was easy to understand how the transformation affects the basis vectors and then how it affects any vector.
  2. In the proof, we reasoned through it using the properties of linear transformations. We showed that the transformation satisfies the properties of additivity and homogeneity, and hence is a linear transformation.

Both approaches are valid and useful in different contexts. The conceptual approach is great for understanding the idea behind linear transformations, while the proof is useful for formalizing the concept.

This is a common theme in mathematics.

Example Problem: Deriving the Matrix Representation of a Linear Transformation

Consider a linear transformation :

Find the matrix representation of this transformation.

To find the matrix representation of this transformation, we need to find how it affects the standard basis vectors.

Let and . Then, is:

Similarly, is:

Hence, the matrix representation of the transformation is:

And,

This shows that the transformation can be represented as a matrix-vector product.

Linear Transformations of Subsets

We've seen how linear transformations affect individual vectors. This can be extended to subsets of vectors, and we can see how linear transformations affect entire regions of the vector space.

Consider three vectors in :

Consider a line segment connecting and .

To construct an expression of this line, consider:

  1. Starting at .
  2. Moving in the direction of .
  3. Doing this for all points in the interval - at , you're at , and at , you're at .

Hence, the line segment can be expressed as:

A visual representation of this line segment is shown below:

Doing this for all three vectors:

The visualization of these line segments is shown below:

We can assign a set, , to be the union of these line segments:

Now imagine applying a linear transformation to each of these line segments.

Recall that these line segments are defined in terms of certain vectors and a parameter . So, to transform this whole set, we need to transform each of these vectors and keep the parameter intact.

Below is the visualization for the following transformation:

Before doing the calculation, let's understand this from an intuitive perspective.

Notice how, in the transformed space, the line segments are still line segments. Recall that, visually, linear transformations keep gridlines parallel and evenly spaced, so straight lines stay straight. This is a visual representation of that.

Additionally, notice how, both in the original and transformed spaces, the line segments are still connected at the endpoints, and the endpoints are the transformed vectors. This means that to create the new set of line segments, we only need to transform the endpoints of the original line segments.

Next, let's prove this mathematically.

First, apply the transformation to . Using the additivity of linear transformations:

Then, using the homogeneity of linear transformations:

And using the additivity again:

This is known as the image of the line segment under the transformation .

Similarly, we can find the image of and under the transformation :

Notice what we've done here - we have a set of infinite vectors that span some line segment. Instead of applying the transformation to every single one of these vectors (which is impossible), we've applied the transformation just to the endpoints of the line segment and then used the linearity of the transformation to find the transformation of the entire line segment.

This reflects a powerful property of linear transformations which we have already shown many times - you don't need to know how it affects every single vector in the space.
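
We can also check this numerically: transforming the points along a segment gives the same results as interpolating between the transformed endpoints. The matrix and endpoints below are arbitrary, made-up values:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 2))   # an arbitrary linear transformation
v1 = rng.standard_normal(2)       # made-up segment endpoints
v2 = rng.standard_normal(2)

for t in np.linspace(0.0, 1.0, 11):
    point_on_segment = v1 + t * (v2 - v1)
    transformed_point = A @ point_on_segment
    # Interpolating between the transformed endpoints gives the same point.
    interpolated = A @ v1 + t * (A @ v2 - A @ v1)
    assert np.allclose(transformed_point, interpolated)
print("the image of the segment is the segment between the images")
```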

Linear Transformations of Subspaces

We've seen how linear transformations affect individual vectors and subsets of vectors. Now, we can extend this to subspaces of the vector space.

Recall what a subspace is - it's a subset of the vector space that is itself a vector space. It contains the zero vector, is closed under addition, and is closed under scalar multiplication.

This means that, for any in the subspace:

  1. is in the subspace.
  2. is in the subspace.
  3. The zero vector is in the subspace.

What we now want to prove is as follows:

If is a subspace of and is a linear transformation, then is a subspace of .

To show that is a subspace, we need to show that it satisfies the three properties of a subspace.

Proving Closure Under Addition for T(S)

Let . Then, .

By the additivity of linear transformations:

By the definition of a subspace, . Hence, , thus showing that is closed under addition.

Proving Closure Under Scalar Multiplication for T(S)

Let and be a scalar.

By definition, . If we multiply by , we get , which is in .

Applying the homogeneity of linear transformations:

Since subspaces are closed under scalar multiplication, . Hence, , showing that is closed under scalar multiplication.

Proving the Zero Vector is in T(S)

This property is actually quite interesting.

Recall that subspaces are closed under scalar multiplication. That is, for any , for any scalar . That means we can just take and get the zero vector.

Hence, the zero vector is in and , so the zero vector is in .

This completes the proof that is a subspace of .

This is a powerful result - it shows that linear transformations preserve the structure of subspaces.

Linear Transformations of the Entire Vector Space

We've sort of "gone up" in terms of the subsets we've been considering:

  1. We started with individual vectors.
  2. We then moved to subsets of vectors.
  3. We then moved to subspaces of the vector space.

Now we're going to consider the entire vector space.

Consider a linear transformation . Then, consider the image of the entire vector space under this transformation, .

The range of can be defined as:

The range of is the set of all possible outputs of . It is a subset of the codomain, . All points from are mapped to points in the range of .

Hence, is the set of all possible outputs of .

Since this is every possible output of , it's just called the image of , or .

Recall that any linear transformation can be represented as a matrix-vector product. Hence:

Recall that a linear transformation is defined by how it affects the basis vectors. Since the basis vectors span the entire vector space, the image of the entire vector space is the span of the transformed basis vectors. This is known as the column space of the matrix representation of the linear transformation.

Hence, the image of the entire vector space under a linear transformation is the column space of the matrix representation of the transformation:

Since the column space contains all possible outputs of the transformation, it can be used to determine whether certain equations have solutions.

For example, consider the equation $A\mathbf{x} = \mathbf{b}$.

If $\mathbf{b}$ is not in the column space of $A$, this means $\mathbf{b}$ is not in the image of the transformation, and hence the equation has no solution.
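
One way to check this computationally: $A\mathbf{x} = \mathbf{b}$ has a solution exactly when appending $\mathbf{b}$ as an extra column does not increase the rank of $A$. A small NumPy sketch with made-up numbers:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])             # made-up matrix; its columns are parallel (rank 1)
b_inside = np.array([1.0, 2.0, 3.0])   # lies in the column space
b_outside = np.array([1.0, 0.0, 0.0])  # does not

def has_solution(A: np.ndarray, b: np.ndarray) -> bool:
    """Ax = b is solvable iff appending b does not increase the rank of A."""
    return np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A)

print(has_solution(A, b_inside))   # True
print(has_solution(A, b_outside))  # False
```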

Algebraic Uses of Matrices

In the most fundamental sense, a matrix is a rectangular array of numbers.

A matrix might look like this:

Arrays like these are used to represent data in a structured way. We can use matrices to represent a wide variety of data and perform operations. For instance, we can use matrices to represent the coefficients of a system of equations.

Example Problem: Solving a System of Equations Using Matrices

The following system of equations is given:

Solve the system of equations for , , , and .

(Source)

Firstly, take note that there are four unknowns and three equations. This means that the system cannot have a unique solution - as we will see, it has infinitely many.

Instead of solving the equations for exact values, we will end up with a general solution with constraints.

The coefficient matrix is the matrix of the coefficients of the variables in the system of equations:

The coefficient matrix is:

We can augment the coefficient matrix with the right-hand side of the equations:

It might not be immediately clear why we're doing this - we could've just solved the equations directly.

One reason is that it just saves time; we won't have to write , , , and every time.

We can apply the same operations to the augmented matrix as we would to the system of equations. For example, we can add a multiple of one row to another, or swap two rows.

The leading entry in a row is the first non-zero element in the row. We want every leading entry to be 1 and, for every row with a leading 1, all other entries in that column to be 0.

For example, this is a matrix that satisfies the conditions:

This is called the reduced row-echelon form (RREF) of the matrix. The RREF of a matrix is denoted as .

Recall our augmented matrix:

We can perform operations on the matrix to get it into RREF.

Denote row 1 as , row 2 as , and row 3 as for simplicity. We will iterate through every column to get the matrix into RREF.

First, notice that already has a leading 1. Hence, we want to make all other entries in the column 0.

This sets the first column in to 0. Next, to set the first column in to 0, we can subtract from :

Now, the first column is in RREF.

has a leading , so we want to make that instead. This can be achieved simply by multiplying by :

To set the third column in to 0, we can add to :

We zeroed out , and can now move on to . Replace with :

Recall the criteria for RREF:

  • The leading entry in each row is 1.
  • The leading 1 in each row is the only non-zero entry in its column.

Our matrix now satisfies these conditions:

Some more terminology to note for this matrix:

  • Each leading 1, as the only non-zero entry in its column, is called a pivot entry.
  • The row with all zero entries is known as the zeroed-out row. By convention, in RREF, the zeroed-out row is at the bottom of the matrix.
  • The process we have just done to achieve RREF is called Gaussian elimination (or, more precisely, Gauss-Jordan elimination, since we reduced all the way to RREF).

To illustrate how this actually helps us solve the system of equations, let's rewrite the matrix in terms of the variables:

The variables associated with the pivot entries are known as pivot variables. In our case, the pivot variables are and .

The others are called free variables. Recall that our system of equations had four unknowns, but only three equations, meaning we can't derive exact values for all variables. The free variables are the ones that can take on any value.

Rewrite the system of equations in terms of the pivot and free variables:

That's the general solution to the system of equations. To help with visualization, we can use a vector to represent the solution:

We can write the solution in terms of the free variables:

Then, separating the pivot and free variables into separate vectors:

Now, the vector point is a linear combination of the free variables. We can find the range of solutions by considering the span of the column vectors (, , and ).

You can think of the vector as starting at the origin and moving in the direction of the column vectors:

  • Since is "fixed", it will always go to .
  • Then, it can add or subtract any multiples of and .

With two free variables, the solution space is then a plane in .
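
The same kind of reduction can be done programmatically. Below is a sketch using SymPy's rref() on a hypothetical augmented matrix with three equations and four unknowns (not the system from this example), which likewise ends up with two pivot variables and two free variables:

```python
from sympy import Matrix

# A hypothetical augmented matrix with 3 equations and 4 unknowns
# (not the system from this example).
augmented = Matrix([
    [1, 2, 1, 1, 7],
    [0, 1, 1, 2, 5],
    [1, 3, 2, 3, 12],
])

rref_matrix, pivot_columns = augmented.rref()
print(rref_matrix)    # the RREF, with one zeroed-out row at the bottom
print(pivot_columns)  # (0, 1): the first two variables are the pivot variables
```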

Example Problem: Solving Systems for Exact Values

Solve the following system of equations for , , and :

Since we have three equations and three unknowns, we can solve for exact values.

The coefficient matrix is:

The augmented matrix is:

We can perform Gaussian elimination to get the matrix into RREF. Instead of going through the process, a table of the steps is shown below:

Row Operations | System | Augmented Coefficient Matrix

Zeros out the first column in .

Zeros out the first column in .

Zeros out the second column in , since the leading coefficient in is in the second column.

Zeros out the second column in , for the same reason as above.

Makes the leading coefficient in 1 for RREF.

Zeros out the third column in .

Zeros out the third column in .

Our resulting matrix is in RREF:

Notice that there are no free variables in this case, since every column of the coefficient matrix contains a pivot.

The solution to the system of equations is then:

The solution is unique, i.e. there is only one solution to the system of equations.

Note that these steps could be carried out without using a matrix at all, but the matrix representation makes them more convenient and easier to follow.

Example Problem: Solving Systems with No Solution

Solve the following system of equations for , , , and :

(Source)

The augmented coefficient matrix is:

The table below shows the steps to get the matrix into RREF:

Row Operations | System | Augmented Coefficient Matrix

Zeros out the first column in .

Zeros out the first column in .

Zeros out the third column in , since the leading coefficient in is in the third column.

Zeros out the third column in .

The matrix is in RREF:

However, there's a problem! Notice the zeroed-out row at the bottom of the matrix. Rewriting it in terms of the variables:

This is a contradiction, since . Therefore, the system of equations has no solution.

Matrices and Dot Products

Matrices have a duality with dot products in that they can be used to represent dot products.

Consider the matrix :

Representing this in terms of row vectors:

Recall the matrix-vector product:

This shows that the matrix-vector product is equivalent to the dot product of the rows of the matrix with the vector.
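
A quick NumPy check of this equivalence, with an arbitrary matrix and vector:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([5.0, 6.0])

row_dots = np.array([np.dot(row, v) for row in A])  # dot each row of A with v
print(row_dots)  # [17. 39.]
print(A @ v)     # [17. 39.]: the same result
```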

Another way to think about this is in terms of linear transformations. We can consider a linear transformation that projects a vector onto some line in space.

This projection can also be expressed as the dot product of the vector with the unit vector in the direction of the line.

Since this transformation turns a vector into a scalar, it can be represented as a $1 \times 2$ matrix (a single row).

Finally, let's consider how this transformation affects the basis vectors. For :

Notice the symmetry - the projection of onto the line is the same as the projection of the line onto . The projection of the line onto is just the -coordinate of the projection of onto the line.

The same logic applies to . As such:

Then, the transformation matrix is:

When applied to a vector :

For non-unit vectors, the concept is similar. Recall the geometric definition of the dot product:

Hence, the linear transformation would be the projection of onto , scaled by the length of .

Hence, there's a duality between matrices and dot products. This is also a nice non-rigorous way to show the commutativity of dot products (i.e. ).
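
Here is a small NumPy sketch of the projection viewpoint, using a made-up unit vector: the $1 \times 2$ matrix whose single row is the unit vector gives the same number as the dot product.

```python
import numpy as np

u = np.array([3.0, 4.0]) / 5.0   # a made-up unit vector along the line
P = u.reshape(1, 2)              # the 1x2 matrix whose row is the unit vector

v = np.array([2.0, 1.0])
print(P @ v)         # [2.]: the (signed) length of the projection of v onto the line
print(np.dot(u, v))  # 2.0: the same number, as a dot product
```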

Furthermore, this offers another way to show the dot product's definition in terms of the components. Going back to the matrix-vector product:

In summary:

Null Space

Recall the definition of a subspace: a subset of a vector space that contains the zero vector and is closed under addition and scalar multiplication.

Consider the following equation:

Let be an matrix, and be an vector.

Essentially, this equation considers all the vectors that are transformed to the zero vector by the matrix .

Consider the set of all solutions to this equation:

Denote this set as .

We can use the criteria for a subspace to determine if this set is a subspace of .

  1. Contains the zero vector - .

    • This is true, since .
  2. Closed under addition - if , then .

    Since , and . Then, .

    Hence, .

  3. Closed under scalar multiplication - if , then .

    Since , . Then, .

    Hence, .

This set is hence a subspace of and is known as the null space of . To reiterate:

Null Space and Linear Independence

Recall that a set of vectors is linearly independent if no vector in the set can be expressed as a linear combination of the others. This means that you can't write one vector by adding or subtracting multiples of the others.

Consider a matrix :

Recall the definition of the null space of :

For to be valid, must be an vector:

Rewrite the null space equation in terms of the column vectors of and :

Recall the condition for linear independence: a set of vectors is linearly independent if the only solution to the above equation is .

Also recall that are the components of . As such, if the column vectors of are linearly independent, then the only solution to the equation is .
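
To see this connection concretely, here is a small SymPy sketch with made-up matrices: one whose columns are linearly independent (so the null space is trivial) and one whose columns are dependent (so it is not).

```python
from sympy import Matrix

# Linearly independent columns: the null space contains only the zero vector.
A_independent = Matrix([[1, 0],
                        [0, 1],
                        [1, 1]])
print(A_independent.nullspace())  # []

# The second column is twice the first: the null space is non-trivial.
A_dependent = Matrix([[1, 2],
                      [2, 4],
                      [3, 6]])
print(A_dependent.nullspace())    # [Matrix([[-2], [1]])]
```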

Example Problem: Evaluating Null Space of a Matrix

Determine the null space of the matrix:

(Source)

The null space of is the set of all vectors such that .

Since is a matrix, must be a vector. Hence we can construct the equation:

Expand this through matrix-vector multiplication:

We can represent this system of equations as an augmented coefficient matrix (denoted ):

Recall that the solution to this system of equations can be found by performing Gaussian elimination to get the matrix into Reduced Row Echelon Form (RREF). Since the augmented part is all zero, it will remain all zero regardless of any operations performed on it. Hence, we shall only focus on the coefficient part of the matrix.

The RREF of the matrix is:

Meaning:

Solving for the pivot variables and in terms of the free variables and :

Recall that we want to find the set of all vectors such that . We can write in terms of the free variables and :

Hence, our solution (null space) is the linear combination of the vectors and with any scalar and . Another way to write this is through the span:

Notice that in , the matrix is essentially . Hence, what we are really solving is:

Hence:

Example Application: Kirchhoff's Current Law (KCL)

KCL is a fundamental law in electrical engineering that is used to analyze circuits.

In a circuit, there are points where wires meet, called junctions.

KCL simply states that the total current entering a junction must equal the total current leaving that junction. Think of it like water flowing through pipes; the amount of water flowing into a junction must equal the amount flowing out.

In electrical engineering, circuits are often represented in matrix form. Create a matrix where each row represents a junction in a circuit, and each column represents a wire. If a wire enters a junction, it is represented as in the matrix, and if it leaves a junction, it is represented as .

Consider the following circuit:

    Wire 1
|
--- <- Junction 1
| |
Wire 2 |
| |
--- <- Junction 2
|
Wire 3

It has two junctions and three wires, hence it is represented as a $2 \times 3$ matrix:

The interesting thing is that the null space of determines the set of all current distributions that satisfy KCL.

Consider the null space of :

We can solve for the null space of :

From the first equation, , and from the second equation, . Hence, .

Then, the null space of is:

This means that the current must be the same in all wires to satisfy KCL.
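
As a sketch, we can check this with SymPy. The incidence matrix below assumes the convention of $+1$ for a wire entering a junction and $-1$ for a wire leaving it; either sign convention gives the same null space.

```python
from sympy import Matrix

# Incidence matrix for the circuit above: rows are junctions, columns are wires.
# Assumed convention: +1 if the wire enters the junction, -1 if it leaves it.
# Wire 1 enters junction 1; wire 2 leaves junction 1 and enters junction 2;
# wire 3 leaves junction 2.
A = Matrix([[1, -1,  0],
            [0,  1, -1]])

print(A.nullspace())  # [Matrix([[1], [1], [1]])]: all three currents must be equal
```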

Column Space and Subspaces

Previously, we discussed the column space of a matrix, which is the span of the column vectors of the matrix:

Let be an matrix with columns. Then, it can be expressed as:

The column space of is the span of the column vectors, i.e. the set of all linear combinations of the column vectors:

Let's consider whether the column space of a matrix is a subspace of . Let and be vectors in .

Both and can be expressed as linear combinations of the column vectors of :

  1. Closure under addition - .

    Since and are in , they can be expressed as linear combinations of the column vectors of :

    Then, .

    Hence, .

  2. Closure under scalar multiplication - .

    Since is in , it can be expressed as a linear combination of the column vectors of :

    Then, .

    Hence, .

Therefore, the column space of a matrix is a subspace of .
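
For a quick computational check, SymPy can return a basis for the column space directly. The matrix below is made up for illustration:

```python
from sympy import Matrix

# A made-up matrix whose third column equals the sum of the first two.
A = Matrix([[1, 2, 3],
            [2, 4, 6],
            [1, 1, 2]])

# A basis for the column space: the first two columns.
print(A.columnspace())  # [Matrix([[1], [2], [1]]), Matrix([[2], [4], [1]])]
```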

Example Problem: Determining Null Space and Column Space

Determine the null space and column space of the matrix:

(Source)

First, we shall find the column space of the matrix. Recall that the column space is the span of the column vectors of the matrix, i.e. the images of the basis vectors under the transformation: